Abstract

Using hashing techniques, this paper develops a family of space-efficient
Las Vegas randomized algorithms for $k$-SUM problems. This family includes
an algorithm that can solve 3-SUM in $O(n^2)$ time and $O(\sqrt{n})$
space. It also establishes a new time-space upper bound for SUBSET-SUM,
which can be solved by a Las Vegas algorithm in
$O^*(2^{(1-\sqrt{8\beta/9})n})$ time and $O^*(2^{\beta n})$ space, for any
$\beta \in [0, \frac{9}{32}]$.

1 Introduction 

The $k$-SUM problem on $n$ numbers can be formulated as follows: Given $k$ sets
$S_1, S_2, \ldots, S_k$ with $n$ integers each and a target $t$, find
$a_1, a_2, \ldots, a_k$ such that $a_i \in S_i$ for all $i$ and
$\sum_{i=1}^{k} a_i = t$. Note that one common variant of the problem
has only a single set $S$ from which all elements in the solution are chosen,
but the two are easily reducible to each other. The $k$-SUM problem can be
trivially solved in $O(n^k)$ arithmetic operations by trying all possibilities,
and a more sophisticated solution runs in $O(n^{\lceil k/2 \rceil} \log n)$ time.
How much faster can it be solved? This turns out to be a fundamental question,
as the complexity of $k$-SUM is related to the complexity of a number of other
problems.

Gajentaan and Overmars [3] classified many problems from computational
geometry as "3SUM-hard" (i.e., there exists an $o(n^2)$ reduction from 3-SUM
to the problem in question) in order to indirectly demonstrate their difficulty.
Finding a subquadratic algorithm for any problem in this class
would immediately produce a subquadratic algorithm for 3-SUM. One example
of such a problem is 3-POINTS-ON-LINE: Given a set of points in the plane,
are there three collinear points? To reduce 3-SUM to this problem, map each
$x \in S$ (using the single-set variation of 3-SUM) to the point $(x, x^3)$, with the
idea that $a_1 + a_2 + a_3 = 0$ if and only if the points $(a_1, a_1^3)$,
$(a_2, a_2^3)$, and $(a_3, a_3^3)$ are collinear.
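
The equivalence behind this reduction can be checked directly. The following is a minimal sketch, assuming the single-set variant with target 0 and pairwise distinct values; the function names are illustrative, and the cubic-time search is only there to exercise the mapping, not to solve 3-SUM efficiently.

```python
from itertools import combinations


def collinear(p, q, r):
    """Return True when three points lie on one line (zero cross product)."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1]) == 0


def three_sum_via_collinearity(values):
    """Return three distinct values summing to zero, or None.

    Each value x is mapped to the point (x, x^3); a collinear triple of
    points then corresponds to a zero-sum triple of values.
    """
    points = [(x, x ** 3) for x in set(values)]
    for p, q, r in combinations(points, 3):
        if collinear(p, q, r):
            return (p[0], q[0], r[0])
    return None
```

For distinct $a, b, c$ the cross product factors as $(b-a)(c-a)(c-b)(a+b+c)$, which is why it vanishes exactly when the three values sum to zero.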

$k$-SUM is also fundamentally connected to several NP-hard problems.
Patrascu and Williams [4] show that solving $k$-SUM over $n$ numbers in $O(n^{o(k)})$
time would imply that 3-SAT with $n$ variables can be solved in $O(2^{o(n)})$ time.
Schroeppel and Shamir [5] have shown how the SUBSET-SUM problem can be
reduced to an (exponentially-sized) $k$-SUM problem. Therefore, more efficient
$k$-SUM algorithms can be used to derive faster SUBSET-SUM algorithms. The
SUBSET-SUM problem on $n$ numbers can be formulated as follows: Given a set
$S$ of $n$ integers and a target $t$, find a subset $S' \subseteq S$ such that
$\sum_{a \in S'} a = t$. Schroeppel and Shamir then provide a space-efficient
4-SUM algorithm to yield a time- and space-efficient SUBSET-SUM algorithm.
They also established a time-space tradeoff theorem for SUBSET-SUM algorithms
that allowed them to provide a time/space upper bound of
$T \cdot S^2 = O^*(2^n)$ given that $T \ge O^*(2^{n/2})$. This
paper will prove a parallel tradeoff result for $k$-SUM algorithms, and then use
their SUBSET-SUM to $k$-SUM reduction to find an improved time-space upper
bound for SUBSET-SUM.



Our Results 

The best known algorithm for 3-SUM takes $O(n^2)$ time, but also requires $O(n)$
space (to hold a sorted array of numbers). Can we use significantly less space
and obtain the same running time? This paper also investigates the time-space
tradeoffs for the general $k$-SUM problem. Given some fixed space budget $S$, we
wish to solve $k$-SUM in time $T$ and space $S$ where $T$ is minimized.

We use hashing techniques to lower the space requirement for 3-SUM: 

Theorem 1.1. 3-SUM on $n$ numbers can be solved by a Las Vegas algorithm¹
in time $O(n^2)$ and space $O(\sqrt{n})$.

These techniques also help lower the space requirements for the general
$k$-SUM problem on $n$ numbers, albeit at the cost of some running time increase.

Theorem 1.2. Let $\delta \le 1$. We will not define $f(x)$ here, but it is a function
from $\mathbb{Z}^+ \to \mathbb{Z}^+$ and $f(x) \le x - \sqrt{x} + 1$. $k$-SUM on $n$
numbers can be solved in
$\tilde{O}(n^{k-\delta(k-1)} + n^{k-\delta(k-1)+(\delta f(k)-1)})$ time and
$\tilde{O}(n^\delta)$ space by a Las Vegas algorithm.

The bound on $f(x)$ implies the following corollary when we let $\delta = 1$:

Corollary 1.3. $k$-SUM on $n$ numbers can be solved in $\tilde{O}(n^{k-\sqrt{k}+1})$ time and
$\tilde{O}(n)$ space by a Las Vegas algorithm.

Here are a few sample values of $f$: $f(3) = 2$, $f(4) = 2$, $f(10) = 7$, and
$f(100) = 90$. Substituting these values into Theorem 1.2 yields:

Corollary 1.4. Let $\delta \le 1$. Then 3-SUM on $n$ numbers can be solved in
$\tilde{O}(n^{3-2\delta} + n^2)$ time and $\tilde{O}(n^\delta)$ space by a Las Vegas algorithm.

¹ Recall that algorithms are Las Vegas randomized if they always give correct results, but
may take additional running time depending on the random numbers generated (but not
depending on the choice of input).

Corollary 1.5. Let $\delta \le 1$. Then 4-SUM on $n$ numbers can be solved in
$\tilde{O}(n^{4-3\delta} + n^{3-\delta})$ time and $\tilde{O}(n^\delta)$ space by a Las Vegas algorithm.

Corollary 1.6. Let $\delta \le 1$. Then 10-SUM on $n$ numbers can be solved in
$\tilde{O}(n^{10-9\delta} + n^{9-2\delta})$ time and $\tilde{O}(n^\delta)$ space by a Las Vegas algorithm.

Corollary 1.7. Let $\delta \le 1$. Then 100-SUM on $n$ numbers can be solved in
$\tilde{O}(n^{100-99\delta} + n^{99-9\delta})$ time and $\tilde{O}(n^\delta)$ space by a Las Vegas algorithm.

These space-efficient algorithms also imply new time/space upper-bounds 
for SUBSET-SUM: 

Theorem 1.8. There is a Las Vegas algorithm for SUBSET-SUM on $n$ numbers
that runs in $O^*(2^{(1-\sqrt{8\beta/9})n})$ time and $O^*(2^{\beta n})$ space, for
$\beta \in [0, \frac{9}{32}]$.

This improves the tradeoff of Schroeppel and Shamir when $S$ is sufficiently
small. For example, when $S = O^*(2^{0.1n})$, Schroeppel and Shamir obtain
$T = O^*(2^{0.8n})$ while we obtain $T = O^*(2^{0.702n})$.
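
The two exponents in this example follow from plain arithmetic. The sketch below, with illustrative function names, evaluates the time exponent of each bound for a given space exponent $\beta$; it assumes the Schroeppel and Shamir curve $T \cdot S^2 = O^*(2^n)$ (valid while $T \ge O^*(2^{n/2})$, i.e. $\beta \le 0.25$) and the bound of Theorem 1.8.

```python
import math


def schroeppel_shamir_time_exponent(beta):
    """Time exponent a with T = 2^(a n) on the curve T * S^2 = 2^n, S = 2^(beta n)."""
    return 1 - 2 * beta


def theorem_1_8_time_exponent(beta):
    """Time exponent 1 - sqrt(8 beta / 9) from Theorem 1.8, for 0 <= beta <= 9/32."""
    return 1 - math.sqrt(8 * beta / 9)
```

At the upper end of the range, $\beta = 9/32$, the second function returns $0.5$.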

2 Preliminaries 

This section covers notation, basic $k$-SUM algorithms, and hashing.

2.1 Notation

Suppression of polylogarithmic factors from polynomial functions is indicated
with $\tilde{O}$. Suppression of polynomial factors from exponential functions is
indicated with $O^*$.

The following definition is also useful for discussing merging the sets of
$k$-SUM problems:

Definition. When $S$ and $T$ are sets, the set $S + T$, also called the Minkowski
sum of $S$ and $T$, is defined as $\{s + t \mid s \in S, t \in T\}$.

2.2 Basic $k$-SUM Algorithms

We present several standard algorithms for $k$-SUM on $n$ numbers for $k \le 4$.
All of them are based around the following solution to 2-SUM that requires the
input sets to be sorted:

Lemma 2.1. Given a 2-SUM problem on $n$ numbers where the elements of $S_1$
can be accessed in nondecreasing order and the elements of $S_2$ can be accessed
in nonincreasing order, where $T(n)$ is the time to access the next element of
either $S_1$ or $S_2$, a solution can be found in $O(n \cdot T(n))$ time and $O(1)$ space.

Proof. Let $s_1$ denote an element of $S_1$ and $s_2$ denote an element of $S_2$. Begin by
setting $s_1$ to the smallest element of $S_1$ and setting $s_2$ to the largest element of
$S_2$. If $s_1 + s_2 = t$, then $s_1$ and $s_2$ form a solution; return it. If $s_1 + s_2 < t$, then
advance $s_1$ to the next element of $S_1$. Otherwise, if $s_1 + s_2 > t$, then advance
$s_2$ to the next element of $S_2$. Repeat this process until a solution is found or
one of the sets is exhausted, in which case there is no solution.

Correctness: Notice that the algorithm processes elements of $S_1$ from smallest to
largest and elements of $S_2$ from largest to smallest. $s_1$ only advances when it could
not sum to $t$ with any element left to be considered in $S_2$. This occurs because
$s_1 + s_2 < t$ implies that the sum of $s_1$ with any element left in $S_2$ is strictly less
than $t$. Similarly, $s_2$ only advances when it could not sum to $t$ with any element
left to be considered in $S_1$.

If the algorithm exhausts either set $S_1$ or $S_2$, then that set has no elements
that could appear in a solution. Hence, there are no solutions.

Running Time: Each comparison with $t$ and element access removes one
element to consider from either $S_1$ or $S_2$, so the algorithm requires $O(n \cdot T(n))$
time at most.

Memory Usage: This algorithm only requires space to store counters and
compute sums, which does not depend on $n$.

This completes the proof. □ 
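
The procedure in the proof of Lemma 2.1 can be sketched as follows. The function name and the use of Python iterators to model "access the next element" are assumptions made for illustration; elements are assumed to be integers (never `None`).

```python
def two_sum_sorted(ascending, descending, t):
    """Find (s1, s2) with s1 + s2 == t, or return None.

    ascending:  iterable over S1 in nondecreasing order
    descending: iterable over S2 in nonincreasing order
    Only the two current elements are kept, so extra space is O(1).
    """
    it1, it2 = iter(ascending), iter(descending)
    s1 = next(it1, None)
    s2 = next(it2, None)
    while s1 is not None and s2 is not None:
        total = s1 + s2
        if total == t:
            return (s1, s2)
        if total < t:
            s1 = next(it1, None)   # s1 is too small for every remaining s2
        else:
            s2 = next(it2, None)   # s2 is too large for every remaining s1
    return None
```

Because the inputs are consumed as streams, the same routine works whether the elements come from a sorted array or are produced on demand.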

The algorithms for 2-SUM, 3-SUM, and 4-SUM are just reductions to the
constrained 2-SUM problem required by Lemma 2.1.

Theorem 2.2. 2-SUM on $n$ numbers can be solved in $O(n \log n)$ time and $O(n)$
space.

Proof. Sort the elements of $S_1$ and $S_2$ into arrays, and run the algorithm from
Lemma 2.1. Sorting requires $O(n \log n)$ time and $O(n)$ space, and
element access can be done in constant time. □

Theorem 2.3. 3-SUM on $n$ numbers can be solved in $O(n^2)$ time and $O(n)$
space.

Proof. Sort the elements of $S_1$ and $S_2$ into arrays. For each element $s_3 \in S_3$,
use the algorithm from Lemma 2.1 to search for a pair summing to $t - s_3$.

Sorting requires $O(n \log n)$ time and $O(n)$ space. Invoking the algorithm
from Lemma 2.1 $n$ times requires $O(n^2)$ time and $O(1)$ space (element access
can be done in constant time). □
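
A minimal sketch of the quadratic 3-SUM algorithm of Theorem 2.3 is given below. The function name is illustrative, and the two-pointer scan of Lemma 2.1 is inlined over array indices.

```python
def three_sum(S1, S2, S3, t):
    """Return (a, b, c) with a in S1, b in S2, c in S3 and a + b + c == t, or None."""
    A = sorted(S1)                 # nondecreasing
    B = sorted(S2, reverse=True)   # nonincreasing
    for s3 in S3:
        target = t - s3
        i = j = 0
        # Two-pointer scan from Lemma 2.1, looking for A[i] + B[j] == target.
        while i < len(A) and j < len(B):
            total = A[i] + B[j]
            if total == target:
                return (A[i], B[j], s3)
            if total < target:
                i += 1
            else:
                j += 1
    return None
```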

Schroeppel and Shamir [5] devised the following 4-SUM algorithm:

Theorem 2.4. 4-SUM on $n$ numbers can be solved in $O(n^2 \log n)$ time and
$O(n)$ space.

Proof. The key data structure is a priority queue that supports inserting, deleting,
and extracting the minimum in logarithmic time per operation and takes
linear space (this is possible with a heap-based priority queue). One priority
queue, $PQ_1$, processes the elements of $S_1 + S_2$ in non-decreasing order while
another priority queue, $PQ_2$, processes the elements of $S_3 + S_4$ in non-increasing
order.

To do this for $S_1 + S_2$, first sort $S_2$ in non-decreasing order. For every
$i = 1, 2, \ldots, |S_1|$, enqueue the pair $(i, 1)$. The priority of any pair $(i, j)$ is
$S_1[i] + S_2[j]$ (the sum of the $i$th element of $S_1$ and the $j$th element of the sorted
$S_2$, both of which are one-indexed). Whenever the pair $(i, j)$ is deleted, where
$j < |S_2|$, immediately insert the pair $(i, j + 1)$. Since $S_2$ is sorted in
non-decreasing order, any pair will be inserted before the minimum priority in the
queue is larger than the pair's priority. The elements of $S_1 + S_2$ are therefore
extracted in order of non-decreasing priority, which is to say in order of
non-decreasing value.

$S_3 + S_4$ is handled similarly, except $S_4$ is sorted in non-increasing order and
the priority queue is used to extract the maximum priority element.

These priority queues reduce the problem to the form found in Lemma 2.1,
but each set now has $n^2$ elements. Accessing elements takes $O(\log n)$ time, so
the final running time is $O(n^2 \log n)$.

The priority queues each use linear memory and only ever contain a linear
number of elements, so the total memory usage is $O(n)$. □
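
The priority-queue construction can be sketched with Python's `heapq`. This is an illustrative rendering rather than the authors' code: names are hypothetical, inputs are assumed to be lists, indices are zero-based, and the maximum-first queue is simulated by negating keys in a min-heap.

```python
import heapq


def sorted_pair_sums(A, B, descending=False):
    """Yield (a + b, a, b) over all pairs in sorted order of a + b.

    The heap holds at most one entry per element of A, so it uses O(|A|)
    space in addition to a sorted copy of B.
    """
    sign = -1 if descending else 1
    B_sorted = sorted(B, reverse=descending)
    if not A or not B_sorted:
        return
    heap = [(sign * (a + B_sorted[0]), i, 0) for i, a in enumerate(A)]
    heapq.heapify(heap)
    while heap:
        key, i, j = heapq.heappop(heap)
        yield sign * key, A[i], B_sorted[j]
        if j + 1 < len(B_sorted):
            # The successor (i, j + 1) can never sort before the pair just removed.
            heapq.heappush(heap, (sign * (A[i] + B_sorted[j + 1]), i, j + 1))


def four_sum(S1, S2, S3, S4, t):
    """Return (a, b, c, d) with one element per set and a + b + c + d == t, or None."""
    low = sorted_pair_sums(S1, S2)                     # nondecreasing sums
    high = sorted_pair_sums(S3, S4, descending=True)   # nonincreasing sums
    lo, hi = next(low, None), next(high, None)
    while lo is not None and hi is not None:
        total = lo[0] + hi[0]
        if total == t:
            return (lo[1], lo[2], hi[1], hi[2])
        if total < t:
            lo = next(low, None)
        else:
            hi = next(high, None)
    return None
```

The two generators play the role of the element-access operation in Lemma 2.1, each access costing one heap pop and at most one push.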

2.3 Hash Functions 

Definition. A family of hash functions $H = \{h : U \to [m]\}$ is said to be
universal if for every $x, y \in U$, if $x \ne y$ then
$\Pr_{h \in H}[h(x) = h(y)] \le \frac{1}{m}$.

Definition. Given a family of hash functions $H = \{h : U \to [m]\}$, and some
set $S \subseteq U$, let the bucket of $h$ with value $v$ be $h^{-1}(\{v\})$ (i.e. all elements with
hash value $v$). Also, define $B_h(x) := h^{-1}(\{h(x)\})$ (the bucket of $h$ with value
$h(x)$).

The following universal family of hash functions, $H_1$, was first introduced by
Dietzfelbinger [2] and applied to 3-SUM by Baran, Demaine and Patrascu [1].
It can be used on the elements of the input sets of $k$-SUM so that only a subset
of them need to be considered at once, saving memory.

Definition. Given a word size $w$, a hash length $s$, and an odd integer $a$, let the
hash function $h_a : U \to [2^s]$ be defined as
$h_a(x) := \lfloor (ax \bmod 2^w) / 2^{w-s} \rfloor$. In C notation,
this hash can be expressed as (a * x) >> (w - s).

Define the family of hash functions $H_1 := \{h_a \mid a \in [2^w], a \text{ odd}\}$.

3 Almost Linear Hashing 

This section covers certain useful properties of $H_1$, which will be used to
construct Las Vegas algorithms for $k$-SUM. Baran, Demaine, and Patrascu [1] gave
the following two lemmas when applying $H_1$ to 3-SUM:

Lemma 3.1. The family of hash functions $H_1$ satisfies almost-linearity, in that
for all $x, y \in U$, $h(x + y) \in \{h(x)\} \oplus \{h(y)\} \oplus \{0, 1\}$
($\oplus$ is addition modulo $2^s$).

Proof. Multiplying by $a$ is linear, and dropping the low-order bits can only
influence the result by 1 due to losing the carry. □
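
The hash $h_a$ and the almost-linearity property can be sketched as below. The helper names are illustrative. Unlike fixed-width C arithmetic, Python integers do not wrap, so the reduction modulo $2^w$ is made explicit with a bit mask.

```python
def make_hash(a, w, s):
    """Return h_a(x) = floor((a * x mod 2^w) / 2^(w - s)) for an odd multiplier a."""
    assert a % 2 == 1 and 0 < s <= w
    mask = (1 << w) - 1
    return lambda x: ((a * x) & mask) >> (w - s)


def almost_linear(h, x, y, s):
    """Check that h(x + y) equals h(x) + h(y) or h(x) + h(y) + 1, modulo 2^s."""
    m = 1 << s
    return (h(x + y) - h(x) - h(y)) % m in (0, 1)
```

The only slack comes from the carry out of the discarded low-order $w - s$ bits, which is either 0 or 1.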

This next lemma applies to any universal family of hash functions, and hence
to $H_1$ as well:

Lemma 3.2. Given any universal family of hash functions $H = \{h : U \to [m]\}$
and some set $S \subseteq U$ of size $n$, the expected number of elements $x \in S$ with
$|B_h(x)| \ge t$ is at most $\frac{2n}{t - 2n/m + 2}$.

Proof. Pick $x \in S$, $y \in S \setminus \{x\}$ randomly and let
$p_h = \Pr_x[|B_h(x)| \ge t]$ and $q_h = \Pr_{x,y}[h(x) = h(y)]$. The expected number
of such elements is $n \cdot E_h[p_h]$, so it suffices to show that
$E_h[p_h] \le \frac{2}{t - 2n/m + 2}$.

Let $S_h = \{x \in S \mid |B_h(x)| < t\}$. Note $|S_h| = (1 - p_h)n$. Notice that:

$$\Pr[h(x) = h(y) \mid x \notin S_h] \ge \frac{t - 1}{n}$$

On the other hand, if $x \in S_h$ and $h(x) = h(y)$, then also $y \in S_h$. By convexity
of the square function, the collision probability of elements of $S_h$ is minimized
when the same number of elements of $S_h$ hash to any value. In this case:

$$|B_h(x)| \ge \left\lfloor \frac{|S_h|}{m} \right\rfloor \ge (1 - p_h)\frac{n}{m} - 1$$

Hence:

$$\Pr[h(x) = h(y) \mid x \in S_h] \ge \frac{(1 - p_h)n/m - 2}{n}$$

Combining yields the following:

$$q_h \ge p_h \cdot \frac{t - 1}{n} + (1 - p_h) \cdot \frac{(1 - p_h)n/m - 2}{n}$$
$$\ge \frac{1}{n}\left(p_h(t - 1) + (1 - 2p_h)\frac{n}{m} - 2(1 - p_h)\right)$$
$$\ge \frac{1}{n}\left(p_h\left(t - 2\frac{n}{m} + 2\right) + \frac{n}{m} - 2\right)$$

By universality, $E_h[q_h] \le \frac{1}{m}$. Taking expectations over $h$, the above
inequality simplifies into:

$$E_h[p_h]\left(t - 2\frac{n}{m} + 2\right) + \frac{n}{m} - 2 \le \frac{n}{m}$$

$$E_h[p_h] \le \frac{2}{t - 2n/m + 2}$$

This completes the proof. □ 

Lemma 3.1 guarantees that if $(k - 1)$ sets have their hash buckets fixed, any
solution that uses elements from those buckets could only have its last element
in one of $k$ buckets of the last set. Hence, hashing can be used to shrink the
problem size with some limited growth in the number of cases. It is worth
noting that this hash works best on 3-SUM, since for larger $k$ applying the
hash tends to increase the running time of the algorithm. It turns out that for
large enough $m$, large buckets can be completely avoided by simply inspecting
a constant number of hashes (in expectation).

Corollary 3.3. Consider a universal family of hash functions
$H = \{h : U \to [m]\}$, a set $S \subseteq U$ of size $n$, where $m \le \sqrt{n}$, and an
arbitrary constant $c \ge 1$. Then:

$$\Pr_{h \in H}\left[\forall x \in S : |B_h(x)| < (c + 2)\frac{n}{m}\right] \ge 1 - \frac{2}{c^2}$$

Proof. Let $t = (c + 2)\frac{n}{m}$. Let $b(h)$ be the number of elements $x \in S$ with
$|B_h(x)| \ge t$. Applying Lemma 3.2 yields that
$E[b(h)] \le \frac{2n}{c(n/m) + 2} < \frac{2m}{c}$. Applying a Markov bound yields
$\Pr_h[b(h) \ge cm] \le \frac{2}{c^2}$. However, if $b(h) < cm$
then in fact $b(h) = 0$, since $b(h)$ counts the number of elements in buckets
of $h$ with at least $(c + 2)\frac{n}{m}$ elements ($m \le \sqrt{n}$ implies $\frac{n}{m} \ge m$). Hence,
$\Pr_h[\forall x : |B_h(x)| < (c + 2)\frac{n}{m}] \ge 1 - \frac{2}{c^2}$. This completes the proof. □
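
Corollary 3.3 suggests a simple Las Vegas selection loop: draw a hash, measure its buckets, and redraw if any bucket is too large. The sketch below is illustrative only. It assumes a non-empty list `S`, draws from the multiplicative family of Section 2.3, and for brevity tallies all buckets at once with a `Counter` (which costs $O(m)$ space) rather than one hash value at a time.

```python
from collections import Counter


def pick_balanced_hash(S, w, s, c, rng):
    """Resample odd multipliers until every bucket of S has fewer than (c + 2) n / m elements.

    Returns (a, tries): the accepted multiplier and the number of hashes drawn.
    rng is any object with a randrange method, e.g. random.Random(seed).
    """
    n, m = len(S), 1 << s
    mask = (1 << w) - 1
    limit = (c + 2) * n / m
    tries = 0
    while True:
        tries += 1
        a = rng.randrange(1, 1 << w, 2)   # random odd multiplier
        counts = Counter(((a * x) & mask) >> (w - s) for x in S)
        if max(counts.values()) < limit:
            return a, tries
```

The answer is always a hash with balanced buckets; only the number of draws is random, which is the Las Vegas behavior the corollary supports.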

4 Las Vegas Algorithms for $k$-SUM

This section uses the hashing results to derive space-efficient Las Vegas
algorithms for $k$-SUM problems. Specifically, we demonstrate how to reduce the
space usage of $k$-SUM algorithms using Corollary 3.3. We use that result to
derive a family of linear-space Las Vegas algorithms for $k$-SUM. We then reapply
that result to derive a set of sublinear-space Las Vegas algorithms that
we will use later to establish new time-space upper bounds for SUBSET-SUM
algorithms.

Theorem 4.1. Let $A$ be a Las Vegas algorithm that solves $k$-SUM ($k \ge 3$)
on $n$ numbers in $T(n)$ time and $S(n)$ space where $T(n), S(n) \in \mathrm{poly}(n)$, and
let $\delta \le 1$ be an arbitrary constant. Then there is a Las Vegas algorithm $A'$
that solves $k$-SUM on $n$ numbers in
$O(n^{k-\delta(k-1)} + n^{k-\delta(k-1)-1} T(n^\delta))$ time and
$O(n^\delta + S(n^\delta))$ space.

This theorem allows us to reduce the exponent of the space usage of a $k$-SUM
algorithm by a factor of $\delta$ at the cost of shrinking the gap between the
running time and $O(n^k)$.

Proof. The key idea is to use hashing to reduce the size of each set by a
square root factor at each step. However, storing any of the intermediate sets of
this computation defeats the purpose of hashing any further. To avoid this, we
first determine all hash functions and values to shrink each set to the desired
size, and then compute the final sets in one step.

$A'$ will recursively construct a list $L$ whose elements are of the form
$(h, v_1, v_2, \ldots, v_k)$, i.e. a hash function followed by $k$ hash values (one for each
$S_i$). At any step, define the active set of $S_i$ to be
$\tilde{S}_i = \{s \in S_i \mid h(s) = v_i, \forall (h, v_1, v_2, \ldots, v_k) \in L\}$.
Each element appended to $L$ reduces the size of all active sets, so elements can
be repeatedly appended until the active sets are only $O(n^\delta)$ in size, at which
point it is safe to invoke $A$. To handle the possibility that $\delta$ is not a perfect
power of $\frac{1}{2}$, define the function $s(x) := \max((\frac{1}{2})^x, \delta)$. Step $\ell$ of
the algorithm will reduce the size of all active sets from $O(n^{s(\ell)})$ to
$O(n^{s(\ell+1)})$.

The recursive helper function $R$ will construct $L$ and then invoke $A$. It has
access to all sets $S_i$ and does the following given a partially constructed $L$:

1. Let $\ell := |L|$. If $s(\ell) = \delta$, then compute
$\tilde{S}_1, \tilde{S}_2, \ldots, \tilde{S}_k$ and call $A$ on them.
Otherwise, the active sets $\tilde{S}_1, \tilde{S}_2, \ldots, \tilde{S}_k$ are
guaranteed to each contain at most $(k + 2)^2 n^{s(\ell)}$ elements.

2. Let $V_\ell := (k + 2) n^{s(\ell) - s(\ell+1)}$. Pick a random hash function $h \in H_1$ that
maps to $V_\ell$ values. For each $S_i$ and possible hash value $v \in [V_\ell]$, iterate
through all elements of $S_i$, consider only the ones in $\tilde{S}_i$, and count how
many hash to the current $v$. If any count exceeds $(k + 2)^2 n^{s(\ell+1)}$ elements,
pick another hash and try again.

3. For each $(v_1, v_2, \ldots, v_{k-1}) \in [V_\ell]^{k-1}$ and $j = 0, 1, \ldots, k - 1$,
let $v_k$ equal $h(t)$, less the sum of all already selected $v_i$'s, less $j$
(mod $V_\ell$). Call $R$ on $L$ appended with $(h, v_1, v_2, \ldots, v_k)$.

Algorithm $A'$ calls $R$ with $L = \emptyset$.

Correctness: We first prove the size guarantee made when calling $R$. $A'$
initially calls $R$ with $\ell = 0$ and sets of size $n \le (k + 2)^2 n^{s(0)}$. $R$ ensures
that the hash it has chosen creates buckets that are no larger than
$(k + 2)^2 n^{s(\ell+1)}$ in size, so it may safely append an additional element to $L$
before making a recursive call to itself.

We also want to show that if a solution exists, we will find it. Due to the
almost-linearity property of $H_1$, we know that a call to $R$ where each element of
the solution is in an active set will in turn make some recursive call where the
elements are still in active sets. Since our first call to $R$ is made with an empty
$L$ (and hence with all elements in active sets), we know that any elements of a
solution will begin in active sets and hence will be found by the algorithm.

Running Time: Checking that the buckets of a randomly-selected hash function
are not too large takes $O(n^{1+s(\ell)-s(\ell+1)})$ time since the algorithm needs to
perform a linear scan for each hash value $v \in [V_\ell]$. We apply Corollary 3.3 with
$c = k$, so we know the chance of a hash failing over a specific $S_i$ is at most
$\frac{2}{k^2}$; the chance of it failing over any $S_i$, by a union bound, is at most
$\frac{2}{k}$. Since $k \ge 3$, the expected number of hashes the algorithm needs to pick
and check is at most three. Hence our expected time checking for hashes during a
single call to $R$, not including recursive subcalls, is $O(n^{1+s(\ell)-s(\ell+1)})$.

There is a single call where $\ell = 0$. Each recursive level of $R$ makes
$O(n^{(k-1)(s(\ell)-s(\ell+1))})$ calls to the level below it. Hence, there are
$O(n^{(k-1)(1-s(\ell))})$ calls to $R$ for a given $\ell$ (all the terms cancel). The total
expected time checking for hashes during all calls with a given $\ell$ is therefore
$O(n^{(k-1)(1-s(\ell)) + (1+s(\ell)-s(\ell+1))})$.

$$(k - 1)(1 - s(\ell)) + (1 + s(\ell) - s(\ell+1)) = k - s(\ell)(k - 2) - s(\ell+1)$$
$$\le k - s(\ell+1)(k - 1)$$
$$\le k - \delta(k - 1)$$

Hence, the total expected time checking for hashes during all calls with a
given $\ell$ is also $O(n^{k-\delta(k-1)})$. Since the algorithm only searches for hash
functions for $\ell \in \{0, 1, \ldots, \lceil \log_2 \frac{1}{\delta} \rceil - 1\}$, the total expected
running time checking for hashes overall is $O(n^{k-\delta(k-1)})$.

When $s(\ell) = \delta$, we need to compute all $\tilde{S}_i$. From our previously-derived
formula, we know that there are only $O(n^{(k-1)(1-\delta)})$ calls where this occurs.
Computing all $\tilde{S}_i$ only requires a linear scan of each $S_i$, so we can do this in
time $O(n^{k-\delta(k-1)})$.

Finally, we invoke $A$ $O(n^{(k-1)(1-\delta)})$ times on sets of size at most
$(k + 2)^2 n^\delta$, so in total we use $O(n^{k-\delta(k-1)-1} T(n^\delta))$ time making calls to $A$.

The total time taken is hence $O(n^{k-\delta(k-1)} + n^{k-\delta(k-1)-1} T(n^\delta))$.

Memory Usage: Notice that $L$ contains at most $\lceil \log_2 \frac{1}{\delta} \rceil$ elements of size
$(k + 1)$ each, so it takes $O(1)$ space. The space needed to check the selected
hash is also $O(1)$, since we compute a count for only a single hash value at a
time.

Invoking $A$ on sets of size at most $(k + 2)^2 n^\delta$ requires only
$O(n^\delta + S(n^\delta))$ space (to store the inputs along with the space needed by $A$).

This completes the proof. □ 

This family of hash functions does particularly well when applied to 3-SUM.
When applied to the basic $O(n^2)$-time, $O(n)$-space algorithm, the space usage
decreases without any running-time cost:

Theorem 1.1. 3-SUM on $n$ numbers can be solved by a Las Vegas algorithm
in time $O(n^2)$ and space $O(\sqrt{n})$.

Proof. From Theorem 2.3, we know 3-SUM can be solved in $T(n) = O(n^2)$ time
and $S(n) = O(n)$ space. We apply Theorem 4.1 with $\delta = 0.5$, which yields a
Las Vegas algorithm that solves 3-SUM in
$O(n^{3-0.5(2)} + n^{3-0.5(2)-1} \cdot n^{0.5 \cdot 2})$, or
$O(n^2)$, time and $O(\sqrt{n})$ space. □
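
For $\delta = 0.5$ and $k = 3$ the construction has a single hashing level, which makes it easy to sketch. The code below is a simplified illustration with hypothetical names, not the authors' implementation: it draws one hash from $H_1$ with roughly $\sqrt{n}$ values, enumerates bucket triples compatible with almost-linearity, and runs the quadratic algorithm of Theorem 2.3 on each triple. The bucket-size check and resampling step of $R$ are omitted, so the output is always correct but the $O(\sqrt{n})$ working-space bound holds only when the drawn hash happens to be balanced.

```python
import math
import random


def three_sum_quadratic(A, B, C, t):
    """Quadratic-time 3-SUM on small sets (sort two sets, scan with two pointers)."""
    A, B = sorted(A), sorted(B, reverse=True)
    for c in C:
        i = j = 0
        while i < len(A) and j < len(B):
            total = A[i] + B[j] + c
            if total == t:
                return (A[i], B[j], c)
            if total < t:
                i += 1
            else:
                j += 1
    return None


def three_sum_small_space(S1, S2, S3, t, w=64, rng=random):
    """3-SUM via one level of hashing; only three buckets are materialized at a time."""
    n = max(len(S1), len(S2), len(S3), 1)
    s = max(1, math.isqrt(n).bit_length() - 1)   # about sqrt(n) hash values
    m, mask = 1 << s, (1 << w) - 1
    a = rng.randrange(1, 1 << w, 2)              # random odd multiplier
    h = lambda x: ((a * x) & mask) >> (w - s)
    for v1 in range(m):
        B1 = [x for x in S1 if h(x) == v1]
        for v2 in range(m):
            B2 = [x for x in S2 if h(x) == v2]
            for j in range(3):
                # Almost-linearity: h(a3) = h(t) - v1 - v2 - j (mod m) for some j in {0, 1, 2}.
                v3 = (h(t) - v1 - v2 - j) % m
                B3 = [x for x in S3 if h(x) == v3]
                found = three_sum_quadratic(B1, B2, B3, t)
                if found is not None:
                    return found
    return None
```

Each bucket is rebuilt by a linear scan of the read-only input instead of being stored, which is where the space saving comes from.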

Theorem 4.1 also yields a family of linear-space Las Vegas algorithms for
$k$-SUM problems, via the following intermediate corollary:

Corollary 4.2. Let $A$ be a Las Vegas algorithm that solves $k_1$-SUM
($k_1 \ge 3$) on $n$ numbers in $O(n^a)$ time and $O(n)$ space for some constant $a$.
Then there is a Las Vegas algorithm $A'$ that solves $(k_1 \cdot k_2)$-SUM on $n$ numbers
in $\tilde{O}(n^{k_1 k_2 - k_1 + 1} + n^{k_1 k_2 - k_1 + 1 + (a - k_2)})$ time and
$\tilde{O}(n)$ space.

Proof. Apply Theorem 4.1 to $A$, choosing $\delta = \frac{1}{k_2}$. Hence, there is a Las Vegas
algorithm $A''$ that solves $k_1$-SUM on $n$ numbers in
$O(n^{k_1 - (k_1 - 1)/k_2} + n^{k_1 - (k_1 - 1)/k_2 + a/k_2 - 1})$ time and
$O(n^{1/k_2})$ space.

However, a $(k_1 \cdot k_2)$-SUM problem on $n$ numbers can be converted to a
$k_1$-SUM problem on $n^{k_2}$ numbers. For $i \in \{1, 2, \ldots, k_1\}$, we let
$S'_i = \sum_{j=1}^{k_2} S_{(i-1)k_2 + j}$
(the Minkowski sum of a block of $k_2$ sets), and we run $A''$ on the sets $S'_i$ with
the same target $t$. Note that we do not actually store all elements of the sets
$S'_i$, but rather compute them on demand in constant time as $A''$ requires them,
in order to avoid using too much memory.

Our algorithm $A'$ is to call $A''$ on the sets $S'_i$. Since these are $n^{k_2}$ in size, the
algorithm $A'$ takes $\tilde{O}(n^{k_1 k_2 - k_1 + 1} + n^{k_1 k_2 - k_1 + 1 + (a - k_2)})$ time
and $\tilde{O}(n)$ space, as desired.

This completes the proof. □ 

Corollary 4.2 can be used to find a linear-space algorithm for $k$-SUM, given that $k$
factors into $k_1$ and $k_2$ and that we already have an algorithm for $k_1$-SUM that
runs in linear space. If $k$ does not factor nicely, it is possible to brute-force over
one set to reduce to $(k - 1)$-SUM:

Lemma 4.3. Let $A$ be an algorithm that solves $k$-SUM ($k \ge 3$) on $n$ numbers in
$O(n^a)$ time and $O(n)$ space. Then there is an algorithm $A'$ that solves
$(k + 1)$-SUM on $n$ numbers in $O(n^{a+1})$ time and $O(n)$ space.

Proof. The algorithm $A'$ is to try every element $s \in S_{k+1}$ as part of the solution and
then to run $A$ on $S_1, \ldots, S_k$ for the remaining elements, which now need to sum
to $t - s$. □

We now construct a function $f(k)$ such that we can produce a Las Vegas
algorithm that can solve $k$-SUM on $n$ numbers in $\tilde{O}(n^{f(k)})$ time and
$\tilde{O}(n)$ space.

Definition. Let $f : \mathbb{Z}^+ \to \mathbb{Z}^+$. Let $f(1) = 1$, $f(2) = 1$, $f(3) = 2$, and $f(4) = 2$.
For $k > 4$, let:

$$f(k) = \min\left\{ f(k - 1) + 1,\ \min_{k_1 \cdot k_2 = k,\ k_1 \ge 3,\ k_2 \ge 2} \left( k - k_1 - k_2 + 1 + \max(f(k_1), k_2) \right) \right\}$$

Corollary 4.4. $k$-SUM on $n$ numbers can be solved in $\tilde{O}(n^{f(k)})$ time and
$\tilde{O}(n)$ space by a Las Vegas algorithm.

Proof. The base cases are covered by Theorem 2.2, Theorem 2.3, and Theorem 2.4
(and $k = 1$ is trivial). For all other $k$, we either get an algorithm from
Corollary 4.2 or Lemma 4.3. □



Table 1 shows the first few values of $f(k)$. The difference between $k$ and $f(k)$
is important because higher differences will permit better time/space tradeoffs.
Due to the construction of $f(k)$, this value $k - f(k)$ is nondecreasing
in $k$ (it is a maximum of its previous value and the result of applying
Corollary 4.2). The values of $k$ where $k - f(k)$ first increases to a new value $v$
($k = 8, 15, 24, 32, 40, 54, \ldots$) occur when $k$ factors evenly into $(v + 1) \cdot f(v + 1)$
(e.g. $15 = 5 \cdot 3$) or when $k$ factors evenly into $(v + 2) \cdot (f(v + 2) - 1)$ (e.g.
$32 = 8 \cdot 4$), whichever is smaller (the latter case occurs if $f(v + 2) = f(v + 1)$).
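
The values of $f$ quoted in this paper can be reproduced with a short recurrence. The sketch below assumes that $f(k)$ is the smaller of the exponent obtained from Lemma 4.3, $f(k-1) + 1$, and the best exponent obtained from Corollary 4.2 over factorizations $k = k_1 \cdot k_2$ with $k_1 \ge 3$, namely $k - k_1 - k_2 + 1 + \max(f(k_1), k_2)$. This reading is inferred from the statements of those two results and from the bound used in the proof of Lemma 4.5.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def f(k):
    """Time exponent of the linear-space k-SUM family (assumed recurrence, see lead-in)."""
    base = {1: 1, 2: 1, 3: 2, 4: 2}
    if k in base:
        return base[k]
    best = f(k - 1) + 1                        # brute-force one set (Lemma 4.3)
    for k1 in range(3, k // 2 + 1):            # factorizations k = k1 * k2 (Corollary 4.2)
        if k % k1 == 0:
            k2 = k // k1
            best = min(best, k - k1 - k2 + 1 + max(f(k1), k2))
    return best
```

Under this assumption the recurrence gives $f(10) = 7$ and $f(100) = 90$, and the gap $k - f(k)$ first grows at $k = 8, 15, 24, 32, 40, 54$, matching the values stated in the text.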

Table 1: Time Complexity Upper Bounds for Linear-Space $k$-SUM Algorithms

$f(x)$ has the following (coarse) upper bound:

Lemma 4.5. For all $x \ge 2$, $f(x) \le x - \sqrt{x} + 1$.

Proof. Notice from Table 1 that this is true for $x \in [2, 8]$. We will now prove it
for $x \ge 9$.

Let $y^2$ be the largest perfect square that is at most $x$. By the definition of
$f$, we know that $f(x) \le f(y^2 - 1) + (x - y^2 + 1)$. Hence, it suffices to show that

$$f(y^2 - 1) - y^2 \le -\sqrt{x}$$

Since $x \ge 9$, $y + 1 \ge 4$. Since $k - f(k)$ is nondecreasing in $k$,
$(y + 1) - f(y + 1) \ge 2$. Simplifying yields $(y - 1) \ge f(y + 1)$.

Notice that $y^2 - 1$ factors into $(y + 1) \cdot (y - 1)$. By our definition of $f$:

$$f(y^2 - 1) \le (y^2 - 1) - (y + 1) - (y - 1) + 1 + \max(f(y + 1), y - 1)$$

Since $\max(f(y + 1), y - 1) = y - 1$ and, by our choice of $y$, $\sqrt{x} < y + 1$,
this implies that:

$$f(y^2 - 1) - y^2 \le -y - 1 \le -\sqrt{x}$$

This completes the proof. □

Corollary 1.3. $k$-SUM on $n$ numbers can be solved in $\tilde{O}(n^{k-\sqrt{k}+1})$ time and
$\tilde{O}(n)$ space by a Las Vegas algorithm.

Proof. This is a direct consequence of Lemma 4.5 combined with Corollary 4.4. □

Applying Theorem 4.1 once more to this linear-space family yields sublinear-space
algorithms:

Theorem 1.2. Let $\delta \le 1$. Then $k$-SUM on $n$ numbers can be solved in
$\tilde{O}(n^{k-\delta(k-1)} + n^{k-\delta(k-1)+(\delta f(k)-1)})$ time and
$\tilde{O}(n^\delta)$ space by a Las Vegas algorithm.

Proof. Corollary 4.4 states that there is a Las Vegas algorithm for $k$-SUM that
runs in $\tilde{O}(n^{f(k)})$ time and $\tilde{O}(n)$ space. Applying Theorem 4.1 then yields the
desired result. □



5 SUBSET-SUM Time-Space Tradeoffs 

Schroeppel and Shamir [5] provided the following reduction from SUBSET-SUM
to $k$-SUM:

Theorem 5.1. Let $A$ be an algorithm that solves $k$-SUM on $n$ numbers in
$\tilde{O}(n^{\alpha k})$ time and $\tilde{O}(n^{\beta k})$ space for some constants $\alpha$ and $\beta$.
Then SUBSET-SUM on $n$ numbers can be solved in $O^*(2^{\alpha n})$ time and
$O^*(2^{\beta n})$ space.

Proof. Consider the following algorithm $A'$:

1. Given a set $S$ with $n$ elements, divide it into $k$ sets $S_1, S_2, \ldots, S_k$ of
$\frac{n}{k}$ elements each. For each set $S_i$, compute the set
$T_i := \{\sum_{s \in S'} s \mid S' \subseteq S_i\}$.
Run $A$ on $T_1, T_2, \ldots, T_k, t$.

Correctness: If there is some solution, the sum of its elements in any $S_i$ will
wind up in the corresponding $T_i$, and hence $A$ will be able to find a solution that
sums to $t$. Note that it is possible to backtrack and recover the original elements
used to generate the elements of the $k$-SUM solution.

Running Time: We call $A$ on sets of size at most $2^{n/k}$, so $A'$ takes $O^*(2^{\alpha n})$
time.

Memory Usage: We call $A$ on sets of size at most $2^{n/k}$, so $A'$ takes $O^*(2^{\beta n})$
space.

This completes the proof. □ 
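
The shape of this reduction can be sketched on small inputs. The code below is illustrative only: function names are hypothetical, `S` is assumed to be a list, the $k$-SUM solver $A$ is replaced by a brute-force stand-in, and each $T_i$ is stored explicitly together with one witness subset per sum so that the original elements can be recovered.

```python
from itertools import combinations, product


def subset_sums(items):
    """Map each achievable subset sum of items to one subset that attains it."""
    sums = {}
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            sums.setdefault(sum(combo), combo)
    return sums


def subset_sum_via_k_sum(S, t, k):
    """Return a sub-list of S summing to t (possibly empty), or None."""
    groups = [S[i::k] for i in range(k)]           # k groups of about n / k elements
    tables = [subset_sums(g) for g in groups]      # T_i, each with at most 2^(n/k) sums
    # Stand-in for algorithm A: brute-force k-SUM over T_1, ..., T_k.
    for choice in product(*(tbl.keys() for tbl in tables)):
        if sum(choice) == t:
            # Backtrack from the chosen sums to the original elements.
            return [x for tbl, v in zip(tables, choice) for x in tbl[v]]
    return None
```

Swapping the brute-force loop for a time- and space-efficient $k$-SUM algorithm is what turns this outline into the bounds of Theorem 5.1.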

They also proved a theorem regarding SUBSET-SUM (as well as other problems
in a specific class of NP-hard problems) that allowed trading increased
running time in return for reduced space. Here is a $k$-SUM analogue of that
result, which allows us to further improve our time-space upper bound on $k$-SUM
(and via Theorem 5.1, SUBSET-SUM as well):

Theorem 5.2. Let $A$ be an algorithm that solves $k$-SUM on $n$ numbers in
$T = \tilde{O}(n^{\alpha k})$ time and $S = \tilde{O}(n^{\beta k})$ space for some constants
$\alpha$ and $\beta$. Then $k$-SUM on $n$ numbers can be solved in any time/space
combination along the tradeoff curve
$T \cdot S^{\frac{1-\alpha}{\beta}} = \tilde{O}(n^k)$,
$\forall\, \tilde{O}(n^{\alpha k}) \le T \le \tilde{O}(n^k)$.

Proof. Consider the following algorithm $A_\gamma$ ($0 \le \gamma \le 1$):

1. Divide each $S_i$ into $n^{1-\gamma}$ regions of consecutive elements of size $n^\gamma$.

2. For each way to choose exactly one region from each $S_i$, run $A$ on that
choice.

In particular, when $\gamma = 0$, $A_\gamma$ is just a brute-force search, while when $\gamma = 1$,
$A_\gamma$ reduces to algorithm $A$.

Correctness: We exhaustively search every combination of regions, and we
know each element in a solution must appear in some region.

Running Time: Algorithm $A$ is called $n^{k(1-\gamma)}$ times on problems of size
$O(n^\gamma)$, so $A_\gamma$ uses $\tilde{O}(n^{k(1-\gamma)+\alpha\gamma k})$ time.

Memory Usage: Algorithm $A$ is called on problems of size $O(n^\gamma)$, so $A_\gamma$ uses
$\tilde{O}(n^{\beta\gamma k})$ space.

Notice that, writing $T'$ and $S'$ for the time and space used by $A_\gamma$:

$$T' \cdot S'^{\frac{1-\alpha}{\beta}} = \tilde{O}\left(n^{k(1-\gamma)+\alpha\gamma k} \cdot n^{\beta\gamma k \cdot \frac{1-\alpha}{\beta}}\right)$$
$$= \tilde{O}\left(n^{k - \gamma k + \alpha\gamma k + \gamma k - \alpha\gamma k}\right)$$
$$= \tilde{O}(n^k).$$

This completes the proof. □ 
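
The exponents along this tradeoff curve are easy to tabulate. The helper below is a small illustrative sketch (the function name is hypothetical) that returns the time and space exponents of $A_\gamma$ as exact fractions.

```python
from fractions import Fraction


def tradeoff_point(alpha, beta, gamma, k):
    """Exponents (time, space) of A_gamma built from an n^(alpha k)-time, n^(beta k)-space A."""
    time_exp = k * (1 - gamma) + alpha * gamma * k
    space_exp = beta * gamma * k
    return time_exp, space_exp
```

For example, the 4-SUM algorithm of Theorem 2.4 has $\alpha = 1/2$ and $\beta = 1/4$; choosing $\gamma = 1/2$ gives time exponent 3 and space exponent $1/2$, and $3 + \frac{1}{2} \cdot \frac{1 - \alpha}{\beta} = 4 = k$ as the theorem requires.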

We have a family of space-efficient Las Vegas algorithms for $k$-SUM from
Theorem 1.2. Applying Theorem 5.2 followed by Theorem 5.1 yields a piecewise
upper-bound curve for SUBSET-SUM algorithms (due to the fact that $k$ must
be an integer). To better understand the behavior of this curve, we formulate it as
a tradeoff between the exponents of $T$ and $S$, as follows:

Theorem 1.8. There is a Las Vegas algorithm for SUBSET-SUM on $n$ numbers
that runs in $O^*(2^{(1-\sqrt{8\beta/9})n})$ time and $O^*(2^{\beta n})$ space, for
$\beta \in [0, \frac{9}{32}]$.

We first prove a lemma:

Lemma 5.3. Given a constant $\beta \in [0, \frac{9}{32}]$, there exists a $k$ such that $k$-SUM
on $n$ numbers is solved by a Las Vegas algorithm that runs in
$T = \tilde{O}(n^{(1-\gamma\sqrt{\beta})k})$ time and $S = \tilde{O}(n^{\beta k})$ space, where
$\gamma = \sqrt{8/9} \approx 0.942809$.

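Proof. Notice that $T \cdot S^{\gamma/\sqrt{\beta}} = \tilde{O}(n^k)$, so by Theorem 5.2 it suffices
to show that there is a Las Vegas algorithm for $k$-SUM on $n$ numbers that runs in
time $T'$ and space $S'$ where $T' \le T$ and
$T' \cdot S'^{\gamma/\sqrt{\beta}} = \tilde{O}(n^k)$. For the remainder of the
proof, we will let $\omega$ denote $\frac{\gamma}{\sqrt{\beta}}$.

If $\omega \le 2$ then we are already done, since by Theorem 2.4 we have a solution
to 4-SUM on $n$ numbers with $T' = \tilde{O}(n^2)$ and $S' = \tilde{O}(n)$ and our range for $\beta$
implies that $\gamma\sqrt{\beta} \le 0.5$. Hence we may assume that $\omega > 2$ for the
remainder of the proof.

By Theorem 1.2, $k$-SUM on $n$ numbers can be solved in
$T' = \tilde{O}(n^{k-\delta(k-1)} + n^{k-\delta(k-1-f(k))-1})$ time and
$S' = \tilde{O}(n^\delta)$ space by a Las Vegas algorithm. Choose
$k = \lceil \omega \rceil + 1$ and $\delta = \frac{1}{\omega - 1}$. Since $\omega > 2$, we know that
$k \ge 4$ and so $k - 1 - f(k) \ge 1$. Moreover $k - 2 \ge \omega - 1$, so
$\delta(k - 2) \ge 1$ and the first term is dominated. Hence the running time $T'$ is in
$\tilde{O}(n^{k-\delta-1})$.

$\gamma$ should be chosen to guarantee that $\gamma\sqrt{\beta}\,k \le \delta + 1$.
Equivalently, $\gamma$ should be chosen such that:

$$\gamma\sqrt{\beta}\,k \le \delta + 1$$
$$\gamma^2 k \le \omega(\delta + 1)$$
$$\gamma^2 \le \frac{\omega}{k} \cdot \frac{\omega}{\omega - 1}$$
$$\gamma^2 \le \frac{\omega^2}{(\omega - 1)k}$$

Since $k \le \omega + 2$, it suffices that:

$$\gamma^2 \le \frac{\omega^2}{(\omega - 1)(\omega + 2)}$$

The right-hand side is minimized when $\frac{(\omega - 1)(\omega + 2)}{\omega^2}$ is maximized.
Taking the derivative shows that this occurs when $\omega = 4$, so it is safe to pick
$\gamma = \sqrt{8/9}$.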

This completes the proof. □ 
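
The final optimization step can be sanity-checked numerically. The sketch below, with an illustrative function name, evaluates the bound $\omega^2 / ((\omega - 1)(\omega + 2))$ exactly with rational arithmetic.

```python
from fractions import Fraction


def gamma_squared_bound(omega):
    """Upper bound on gamma^2 for a given omega > 2: omega^2 / ((omega - 1)(omega + 2))."""
    return omega ** 2 / ((omega - 1) * (omega + 2))
```

At $\omega = 4$ the bound equals $16/18 = 8/9$, and it tends to 1 both as $\omega \to 2$ and as $\omega \to \infty$, consistent with the choice $\gamma = \sqrt{8/9}$.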

Applying Theorem 5.1 to Lemma 5.3 directly yields Theorem 1.8.
The following graph demonstrates the previous best-known trade-off curve,
found by Schroeppel and Shamir [5] (labeled as basic 4-SUM), along with the
piecewise upper bound obtainable from Theorem 1.2. The graph also includes
the time-space tradeoff as given by Theorem 1.8.

[Figure: SUBSET-SUM Time-Space Tradeoff Curves. Axis label: Time ($\alpha$).]



6 Conclusion 

An interesting open problem is whether there exists a deterministic algorithm
that runs in the same time for the $k$-SUM problem on $n$ numbers. It might
be easier to consider the $k$-XOR problem, which is identical except that the
elements are vectors over $\mathbb{F}_2$ instead of integers. For this variant there is a
simple linear universal family of hash functions, and so it seems possible that
there might be a way to derandomize the hash selection process.

Another interesting question is whether the function $f(k)$ can be further
improved. The fact that there exists an $\tilde{O}(n^2)$-time and $O(n)$-space algorithm
for 4-SUM does not match the pattern found in the rest of the table, suggesting
that there might be more efficient algorithms for other $k$ as well.